(57) Abstract:

PURPOSE: To provide a compact character string retrieval
device in which the state transition table can be implemented
with a small amount of memory by eliminating unnecessary
slots.

CONSTITUTION: This device is equipped with a character
string storage means 105 which stores a text; a character
code converting means 4000 which fetches each character
from the text read from the character string storage means,
outputs the code corresponding to the character when its
character code is included in a retrieval term, and outputs
a specific code when it is not; a retrieval control means
101 which converts the character codes of the previously
supplied retrieval terms into the corresponding codes; and
a character string collating means 102 which collates, in a
batch, whether or not the code strings corresponding to the
plural retrieval terms transmitted from the retrieval
control means are present in the code string output from
the character code converting means.
[0027]

[Effect] The principle of matching in the present invention 
employing the above means is explained. 

[0028] In the search control means 101, the character codes
used in the search terms are first identified, and a serial
number, for example, is assigned to each of these character
codes. The resulting character code correspondence table is
set in the character code conversion means 4000. In addition,
an automaton generated from the search terms, which have been
converted into the serial numbers, is set in the letter string
matching means 102.
[0029] In the search, the text read out from the letter
string storing means 105 is loaded letter-by-letter into the
character code conversion means 4000, and it is determined
whether or not each character code is included in the search
terms. If the character code is included, the serial number
corresponding to the letter is output, and if not, a given
number, for example 0 (zero), is output.

[0030] Whether or not a code string corresponding to any of a
plurality of search terms is present in the code string
output from the character code conversion means 4000 is
determined collectively by the matching in the letter string
matching means 102.

[0031] The above principle is explained using a specific
example.

[0032] First, the setting information generation method
is explained.

[0033] In the present example, two terms, "CAT" and "DOG",
are employed as search terms, as in the automaton in Fig. 4.
There are six types of letters used in these search terms. The
letters "C", "A", "T", "D", "O", and "G" used in the search
terms correspond to "(1)", "(2)", "(3)", "(4)", "(5)", and
"(6)", respectively, and the letters not used in the search
terms correspond to "(0)". This character code conversion
table is set in the character code conversion means 4000. The
character code conversion table of the present example set in
the above manner is shown in Fig. 7.
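
A minimal Python sketch of how such a conversion table could be generated from the search terms. The name build_code_table and the rule of numbering characters in order of first appearance are assumptions made for illustration; the text only requires that serial numbers be assigned to the characters used in the search terms.

```python
def build_code_table(search_terms):
    """Map each character used in the search terms to a serial number.

    Assumption: numbers start at 1 and follow order of first appearance;
    0 is reserved for characters that appear in no search term.
    """
    table = {}
    for term in search_terms:
        for ch in term:
            if ch not in table:
                table[ch] = len(table) + 1
    return table


code_table = build_code_table(["CAT", "DOG"])
print(code_table)
# {'C': 1, 'A': 2, 'T': 3, 'D': 4, 'O': 5, 'G': 6}
```

With this numbering the table reproduces the correspondence described above, and any character missing from the table would be treated as "(0)".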

[0034] In the same way, the search terms are converted to
"(1) (2) (3)" corresponding to "CAT" and "(4) (5) (6)"
corresponding to "DOG". The automaton generated from these
converted search terms is shown in Fig. 8; the letters
assigned to the transitions of the automaton of Fig. 4 are
converted into the corresponding serial numbers. This
automaton is set in the letter string matching means 102.
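
One possible way to build an automaton over the serial numbers is sketched below in Python. It assumes an Aho-Corasick-style construction that fills a dense state transition table; the text does not name the construction it uses, Fig. 8 is not reproduced here, and build_transition_table and its state numbering are hypothetical.

```python
from collections import deque


def build_transition_table(code_terms, alphabet_size):
    """Return (table, outputs) for a dense automaton over codes 0..alphabet_size-1.

    table[state][code] is the next state; outputs[state] lists the terms
    recognised on entering that state. State 0 is the start state.
    """
    table = [[None] * alphabet_size]
    outputs = [[]]
    # Phase 1: insert every code string as a path from the start state.
    for term, codes in code_terms.items():
        state = 0
        for code in codes:
            if table[state][code] is None:
                table.append([None] * alphabet_size)
                outputs.append([])
                table[state][code] = len(table) - 1
            state = table[state][code]
        outputs[state].append(term)
    # Phase 2: breadth-first pass that fills the missing transitions by
    # following the longest proper suffix that is also a path prefix.
    fail = [0] * len(table)
    queue = deque()
    for code in range(alphabet_size):
        nxt = table[0][code]
        if nxt is None:
            table[0][code] = 0
        else:
            queue.append(nxt)
    while queue:
        state = queue.popleft()
        for code in range(alphabet_size):
            nxt = table[state][code]
            if nxt is None:
                table[state][code] = table[fail[state]][code]
            else:
                fail[nxt] = table[fail[state]][code]
                outputs[nxt] = outputs[nxt] + outputs[fail[nxt]]
                queue.append(nxt)
    return table, outputs


table, outputs = build_transition_table(
    {"CAT": [1, 2, 3], "DOG": [4, 5, 6]}, alphabet_size=7
)
print(len(table), len(table[0]))  # 7 7
```

Under these assumptions the table has 7 states with 7 columns each, one column per serial number rather than one per raw character code.
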
[0035] In the present example, this code conversion reduces
the types of character code to 7 types, from "(0)" to "(6)",
which is approximately one-fortieth of the conventional 256
types. Therefore, the state transition table requires
approximately one-fortieth of the capacity.
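
The size comparison can be checked with a short calculation. The sketch assumes a dense table with one entry per state and input code, and a count of 7 states (a start state plus one state per letter of the two three-letter terms); neither figure is stated in the text.

```python
def table_entries(num_states, num_codes):
    """Entries in a dense state transition table (one per state/code pair)."""
    return num_states * num_codes


NUM_STATES = 7  # assumption: start state + 3 states for "CAT" + 3 for "DOG"

conventional = table_entries(NUM_STATES, 256)  # one column per 8-bit code
reduced = table_entries(NUM_STATES, 7)         # one column per serial number
ratio = conventional / reduced

print(conventional, reduced, round(ratio, 1))  # 1792 49 36.6
```

The ratio 256 / 7 is about 36.6 regardless of the number of states, which is the "approximately one-fortieth" figure quoted above.
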
[0036] The matching operation is explained next. In the
following description, "HOTDOG" is used as an example of text.
[0037] First, when "H" is input to the character code
conversion means 4000, the corresponding "(0)" is output by the
character code conversion table shown in Fig. 7.
[0038] Next, when "O" is input, the corresponding "(5)"
is output by the character code conversion table.
[0039] In the same manner, when "T", "D", "O", and "G" are
input one after another, "(3)", "(4)", "(5)", and "(6)" are
obtained in series as outputs. In other words, "(0) (5) (3)
(4) (5) (6)" is transmitted to the letter string matching
means 102 as the output text.
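
The letter-by-letter conversion can be sketched in Python as a table lookup. The sketch assumes one-byte character codes, in line with the "conventional 256 types" mentioned above; build_lookup and convert_stream are hypothetical names, and the code assignments are those of the example.

```python
def build_lookup(code_table, size=256):
    """Build a flat lookup indexed by character code; unused entries stay 0."""
    lookup = [0] * size
    for ch, code in code_table.items():
        lookup[ord(ch)] = code
    return lookup


def convert_stream(text, lookup):
    """Yield one serial number per input character, in input order."""
    for ch in text:
        yield lookup[ord(ch)]


LOOKUP = build_lookup({"C": 1, "A": 2, "T": 3, "D": 4, "O": 5, "G": 6})
codes = list(convert_stream("HOTDOG", LOOKUP))
print(codes)  # [0, 5, 3, 4, 5, 6]
```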

[0040] In the letter string matching means 102, "(1) (2)
(3)" corresponding to "CAT" or "(4) (5) (6)" corresponding to
"DOG" is searched for in the output text transmitted from the
character code conversion means 4000, based on the automaton
shown in Fig. 8. Here, for the input "(0) (5) (3) (4) (5) (6)",
"(4) (5) (6)" corresponding to "DOG" is matched.
[0041] By the above operations, the state transition table
can be implemented in a memory capacity that is a few tens of
times smaller than that of the prior art, and it is possible
to provide a compact letter string search apparatus.
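
The matching step of paragraph [0040] can be sketched in Python as a walk over a 7 x 7 state transition table. The table below is a hypothetical one written out by hand for the code strings "(1) (2) (3)" and "(4) (5) (6)"; the automaton of Fig. 8 is not reproduced in this text, so the state numbering and the names TRANSITIONS, ACCEPTING and find_terms are assumptions.

```python
# Rows are states, columns are the serial numbers 0..6; state 0 is the start.
TRANSITIONS = [
    [0, 1, 0, 0, 4, 0, 0],  # state 0: nothing matched yet
    [0, 1, 2, 0, 4, 0, 0],  # state 1: seen (1)
    [0, 1, 0, 3, 4, 0, 0],  # state 2: seen (1)(2)
    [0, 1, 0, 0, 4, 0, 0],  # state 3: seen (1)(2)(3), i.e. "CAT"
    [0, 1, 0, 0, 4, 5, 0],  # state 4: seen (4)
    [0, 1, 0, 0, 4, 0, 6],  # state 5: seen (4)(5)
    [0, 1, 0, 0, 4, 0, 0],  # state 6: seen (4)(5)(6), i.e. "DOG"
]
ACCEPTING = {3: "CAT", 6: "DOG"}


def find_terms(codes):
    """Return (end_index, term) for every search term found in the code string."""
    state = 0
    hits = []
    for index, code in enumerate(codes):
        state = TRANSITIONS[state][code]
        if state in ACCEPTING:
            hits.append((index, ACCEPTING[state]))
    return hits


hits = find_terms([0, 5, 3, 4, 5, 6])  # converted form of "HOTDOG"
print(hits)  # [(5, 'DOG')]
```

Each input code costs one table lookup, and both search terms are checked in the same pass over the text.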